(57) Abstract

The present invention contemplates the use of either generalized recombination, or more preferably, site-specific
recombination to facilitate the sequence or fragment analysis of DNA molecules. Most preferably, the site-specific
recombination system of bacteriophage P1 is employed to facilitate such sequence analysis.

TITLE OF THE INVENTION:

RECOMBINATION-FACILITATED MULTIPLEX ANALYSIS OF
DNA FRAGMENTS

FIELD OF THE INVENTION:

The invention relates to the use of recombinases, and
in particular, the Cre protein of bacteriophage P1 and its
loxP DNA recombinational site, to facilitate the
restriction fragment analysis and sequencing of a DNA
molecule.

BACKGROUND OF THE INVENTION:

The techniques of molecular biology were developed to
analyze relatively small DNA molecules. Increasingly,
however, research has centered on the analysis of larger
and larger DNA molecules, such as the chromosomes of
mammals, and in particular, the chromosomes of the human
genome. The analysis of such large DNA molecules has
often been limited by the ease with which the initially
developed technology could be adapted to permit the
analysis of such extremely large molecules. Such methods
have exploited the ability of restriction endonucleases to
produce fragments of the DNA molecule which would then be
more amenable to sequence analysis.

I. THE MAPPING OF RESTRICTION ENDONUCLEASE
RECOGNITION SITES

One initial objective in the analysis of DNA
molecules is to produce a gross physical map of the DNA.
For small DNA molecules, this may be readily achieved
using restriction endonucleases to identify and orient the
corresponding recognition sites of such enzymes. Methods
for performing such "restriction mapping" are well-known
(see, for example, Perbal, B., A Practical Guide to
Molecular Cloning, John Wiley & Sons, NY (1984), pp. 208-
216; Maniatis, T., et al. (In: Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Press, Cold Spring
Harbor, NY (1982)), both herein incorporated by reference).
As will be apparent, the complexity of the data
obtained in "restriction mapping" a target molecule
increases rapidly as the size of the target molecule
increases. For this reason, it is usually necessary to
employ several strategies when attempting to obtain
detailed maps of a large DNA molecule.
One strategy which is employed involves
simultaneously digesting a target DNA molecule with
combinations of several restriction enzymes each of which
is expected to cleave at only a small number of sites in
the target. For linear target molecules, the number of
fragments equals the number of restriction enzyme sites
plus 1. For a circular DNA molecule, the number of
double-digestion fragments equals the number of fragments
generated by the first enzyme plus the number of fragments
generated by the second enzyme. Restriction maps are
created from the data by a process which is part logic,
and part trial and error (Lawn, R.M. et al., Cell 15:1157
(1978)).
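
The fragment-count rules stated above can be checked with a short sketch. It is illustrative only: the function name `digest`, the 10 kb molecule length, and the site positions are assumptions made for this example and do not come from the references cited.

```python
def digest(length, sites, circular=False):
    """Return fragment lengths from a complete digest.

    `sites` are cut positions measured from an arbitrary origin.
    All values used below are hypothetical.
    """
    cuts = sorted(set(sites))
    if not cuts:
        return [length]
    if circular:
        # Fragments between neighbouring cuts, plus one that wraps
        # around the origin of the circle.
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(length - cuts[-1] + cuts[0])
        return frags
    bounds = [0] + cuts + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical site positions for two enzymes on a 10 kb molecule.
enzyme_a = [1000, 4000]
enzyme_b = [2500, 7000, 9000]

linear_double = digest(10000, enzyme_a + enzyme_b)
circular_a = digest(10000, enzyme_a, circular=True)
circular_b = digest(10000, enzyme_b, circular=True)
circular_double = digest(10000, enzyme_a + enzyme_b, circular=True)

print(len(linear_double))    # number of sites + 1 for a linear molecule
print(len(circular_double))  # len(circular_a) + len(circular_b)
```

The sketch assumes that no two enzymes share a cut position; coincident sites would reduce the double-digest count.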

The process of creating a restriction map may be
facilitated by a sequential analysis of fragments. In
this method, one treats a target molecule with a first
restriction endonuclease, isolates the digestion products,
and then subjects the purified products to digestion with
a second endonuclease. Such steps can be performed
rapidly and efficiently (Parker, R.C. et al., Meth.
Enzymol. 65:358 (1980)).

In lieu of obtaining a complete endonuclease
digestion of a target molecule, considerable information
can be obtained by incubating the target molecule with
endonucleases under conditions resulting in limited
digestion. Two approaches have been developed. In the
first, the aim is to compare the sizes of the partial- and
complete-digestion products with one another, and to
deduce which fragments might be adjacent to one another in
the target molecule. In general, the number of partial-
digestion products (F) of a linear DNA molecule that
contains N+1 restriction sites (where N>0) is given by the
formula:

$F = N(N+3)/2$

For a molecule having 20 sites, 209 partial-digestion
products are obtainable. Thus, for large DNA molecules,
the method is of very limited utility since the number of
possible partial-digestion products quickly becomes
unmanageable.
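
The figure of 209 products for 20 sites can be reproduced by direct enumeration. The sketch below rests on one assumption about what is being counted: every contiguous stretch between two boundaries of the molecule (its two ends plus its internal sites) is a possible product, and the intact molecule and the complete-digestion fragments are excluded. The function name `partial_products` is invented for illustration.

```python
from itertools import combinations


def partial_products(n_sites):
    """Count partial-digestion products of a linear molecule.

    Assumption: a product is any contiguous run between two boundaries,
    excluding the intact molecule and the complete-digestion fragments.
    """
    if n_sites < 1:
        raise ValueError("at least one restriction site is required")
    boundaries = range(n_sites + 2)           # two ends + internal sites
    runs = list(combinations(boundaries, 2))  # every contiguous fragment
    complete = [r for r in runs if r[1] - r[0] == 1]
    intact = [r for r in runs if r == (0, n_sites + 1)]
    return len(runs) - len(complete) - len(intact)


print(partial_products(20))
```

Under this reading, a molecule with N+1 sites gives N(N+3)/2 products, which agrees with the enumeration for every site count tested.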

One means for simplifying such an analysis was
proposed by Smith, H.O. et al. (Nucleic Acids Res.
3:2387 (1976), herein incorporated by reference). This
method uses a target molecule which has been labeled with
32P at one of its termini. Digestion products are
visualized by autoradiography after electrophoresis in
agarose gels. Digestion products which are not linked to
the labelled termini are not detected by the analysis.
Thus, the number of labelled partial-digestion products is
equal to the number of restriction sites within the target
molecule. Moreover, the labelled fragments form a simple,
overlapping ladder, with a common labelled terminus. The
order of ascension of the fragments corresponds to the
order of restriction sites in the target molecule.
Lastly, partial-digestion products produced through the
action of several different enzymes can be analyzed
simultaneously on the same gel.
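
The end-label approach can be illustrated with a small model. This is a sketch under stated assumptions rather than the cited authors' procedure: the label is taken to sit at position 0, the function names are invented, and the site positions assigned to each enzyme are hypothetical.

```python
def labelled_ladder(length, sites):
    """Lengths of the partial-digestion products that carry the label.

    Assumption: the label sits at position 0, so a detectable product
    always runs from position 0 to one of the cut sites.
    """
    return sorted(s for s in set(sites) if 0 < s < length)


def site_order(length, sites_by_enzyme):
    """Merge the ladders of several enzymes into one ordered site map."""
    bands = [
        (pos, enzyme)
        for enzyme, sites in sites_by_enzyme.items()
        for pos in labelled_ladder(length, sites)
    ]
    return sorted(bands)


# Hypothetical site positions on a 5 kb fragment.
ladder = labelled_ladder(5000, [3200, 700, 1500])
mapped = site_order(5000, {"EcoRI": [700, 3200], "BamHI": [1500]})
print(ladder)
print(mapped)
```

Each labelled band corresponds to exactly one site, so reading the bands from shortest to longest gives the order of the sites directly.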

One deficiency of the method is the difficulty which
is often encountered in specifically labelling a single
end of a target molecule. Indeed, due to the symmetrical
nature of a double-stranded DNA molecule, both ends of the
molecule are equally available for labelling. Thus, in
practice, considerable difficulty is encountered in
labelling only one end of a linear DNA molecule.

II. THE SEQUENCING OF DNA MOLECULES

Initial attempts to determine the sequence of a DNA
molecule were extensions of techniques which had been
initially developed to permit the sequencing of RNA
molecules (Sanger, F., J. Mol. Biol. 13:373 (1965);
Brownlee, G.G. et al., J. Mol. Biol. 34:379 (1968)). Such
methods involved the specific cleavage of DNA into smaller
fragments by (1) enzymatic digestion (Robertson, H.D. et
al., Nature New Biol. 241:38 (1973); Ziff, E.B. et al.,
Nature New Biol. 241:34 (1973)); (2) nearest neighbor
analysis (Wu, R., et al., J. Mol. Biol. 57:491 (1971)),
and (3) the "Wandering Spot" method (Sanger, F., Proc.
Natl. Acad. Sci. (U.S.A.) 70:1209 (1973)).

More recent advances in DNA sequencing have led to
the development of two highly utilized methods for
elucidating the sequence of a DNA molecule: the
"Dideoxy-Mediated Chain Termination Method," also known as
the "Sanger Method" (Sanger, F., et al., J. Mol. Biol.
94:441 (1975)) and the "Maxam-Gilbert Chemical Degradation
Method" (Maxam, A.M., et al., Proc. Natl. Acad. Sci.
(U.S.A.) 74:560 (1977), both references herein
incorporated by reference).

A. DIDEOXY-MEDIATED CHAIN TERMINATION METHOD
OF DNA SEQUENCING

In the dideoxy-mediated or "Sanger" chain termination
method of DNA sequencing, the sequence of a DNA molecule
is obtained through the extension of an oligonucleotide
primer which is hybridized to the nucleic acid molecule
being sequenced. In brief, four separate primer extension
reactions are conducted. In each reaction, a DNA
polymerase is added along with the four nucleotide
triphosphates needed to polymerize DNA. Each of the
reactions is carried out in the additional presence of a
2',3' dideoxy derivative of the A, T, C, or G nucleoside
triphosphates. Such derivatives differ from conventional
nucleotide triphosphates in that they lack a hydroxyl
residue at the 3' position of deoxyribose. Thus, although
they can be incorporated by a DNA polymerase into the
newly synthesized primer extension, the absence of the 3'
hydroxyl group causes them to be incapable of forming a
phosphodiester bond with a succeeding nucleotide
triphosphate. Thus, the incorporation of a dideoxy
derivative results in the termination of the extension
reaction. Since the dideoxy derivatives are present in
lower concentrations than their corresponding,
conventional nucleotide triphosphate analogs, the net
result of each of the four reactions is to produce a set
of nested oligonucleotides each of which is terminated by
the particular dideoxy derivative used in the reaction.
By subjecting the reaction products of each of the
extension reactions to electrophoresis, it is possible to
obtain a series of four "ladders." Since the position of
each "rung" of the ladder is determined by the size of the
molecule, and since such size is determined by the
incorporation of the dideoxy derivative, the appearance
and location of a particular "rung" can be readily
translated into the sequence of the extended primer.
Thus, through an electrophoretic analysis, the sequence of
the extended primer can be determined.
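
How four ladders translate into a sequence can be shown with a minimal sketch. It assumes an idealized gel in which every extension length appears as exactly one band in exactly one lane; the function names and the example sequence are hypothetical.

```python
def simulate_lanes(extension):
    """Idealized gel: one band per terminated product, keyed by lane.

    `extension` is the sequence added to the primer. Position i
    (1-based) gives a band of length i in the lane of that base.
    """
    lanes = {base: set() for base in "ACGT"}
    for i, base in enumerate(extension, start=1):
        lanes[base].add(i)
    return lanes


def read_lanes(lanes):
    """Read the extended sequence from the four ladders."""
    calls = {}
    for base, lengths in lanes.items():
        for n in lengths:
            if n in calls:
                raise ValueError(f"band of length {n} seen in two lanes")
            calls[n] = base
    # Shortest product first: the rung order is the base order.
    return "".join(calls[n] for n in sorted(calls))


lanes = simulate_lanes("GATTACA")
print(read_lanes(lanes))
```

Real gels are noisier than this model, which ignores compressions, missing bands and the limited resolution of long fragments.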

One deficiency of the dideoxy-mediated sequencing
method is the need to optimize the ratio of dideoxy
nucleoside triphosphates to conventional nucleoside
triphosphates in the chain-extension/chain-termination
reactions. Such adjustments are needed in order to
maximize the amount of information which can be obtained
from each primer. Additionally, the efficiency of dideoxy
nucleotide incorporation in a particular target molecule
is partially dependent upon the primary and secondary
structures of the target.

The dideoxy-mediated method thus requires single-
stranded templates, specific oligonucleotide primers, and
high quality preparations of a DNA polymerase (typically
the Klenow fragment of E. coli DNA polymerase I).
Initially, these requirements delayed the widespread use
of the method. However, with the ready availability of
synthetic primers, and the availability of bacteriophage
M13 and phagemid vectors (Maniatis, T., et al., Molecular
Cloning, a Laboratory Manual, 2nd Edition, Cold Spring
Harbor Press, Cold Spring Harbor, New York (1989), herein
incorporated by reference), the dideoxy-mediated chain
termination method is now extensively employed.

B. THE MAXAM-GILBERT METHOD OF DNA SEQUENCING

The Maxam-Gilbert method of DNA sequencing is a
degradative method. In this procedure, a fragment of DNA
is labeled at one end and partially cleaved in four
separate chemical reactions, each of which is specific for
cleaving the DNA molecule at a particular base (G or C) or
at a particular type of base (A/G, C/T, or A>C). As in the
above-described dideoxy method, the effect of such
reactions is to create a set of nested molecules whose
lengths are determined by the locations of a particular
base along the length of the DNA molecule being sequenced.
The nested reaction products are then resolved by electro-
phoresis, and the end-labeled molecules are detected,
typically by autoradiography when a 32P label is employed.
Four single lanes are typically required in order to
determine the sequence.
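
Because some of the chemical reactions cleave at more than one base, a base is called from the combination of lanes that show a band. The sketch below assumes the commonly used lane set G, A+G, C+T and C, which is only one of the combinations the text allows; the function names are invented, and each band is indexed by the position of the cleaved base for simplicity.

```python
# Lane pattern -> base call, assuming lanes G, A+G, C+T and C.
CALL_TABLE = {
    frozenset({"G", "A+G"}): "G",
    frozenset({"A+G"}): "A",
    frozenset({"C", "C+T"}): "C",
    frozenset({"C+T"}): "T",
}


def call_base(lanes_with_band):
    """Call one base from the set of lanes showing a band."""
    pattern = frozenset(lanes_with_band)
    if pattern not in CALL_TABLE:
        raise ValueError(f"uninterpretable lane pattern: {sorted(pattern)}")
    return CALL_TABLE[pattern]


def read_gel(bands):
    """`bands` maps a position to the lanes showing a band there."""
    return "".join(call_base(bands[pos]) for pos in sorted(bands))


gel = {1: {"G", "A+G"}, 2: {"A+G"}, 3: {"C+T"}, 4: {"C", "C+T"}}
print(read_gel(gel))
```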

The Maxam-Gilbert method thus uses simple chemical
reagents which are readily available. Nevertheless, the
dideoxy-mediated method has several advantages over the
Maxam-Gilbert method. The Maxam-Gilbert method is
extremely laborious and requires meticulous experimental
technique. In contrast, the Sanger method may be employed
on larger nucleic acid molecules.

Significantly, in the Maxam-Gilbert method the
sequence is obtained from the original DNA molecule, and
not from an enzymatic copy. For this reason, the method
can be used to sequence synthetic oligonucleotides, and to
analyze DNA modifications such as methylation, etc. It
can also be used to study both DNA secondary structure and
protein-DNA interactions. Indeed, it has been readily
employed in the identification of the binding sites of DNA
binding proteins.

Methods for sequencing DNA using either the dideoxy-
mediated method or the Maxam-Gilbert method are widely
known to those of ordinary skill in the art. Such methods
are, for example, disclosed in Maniatis, T., et al.,
Molecular Cloning, a Laboratory Manual, 2nd Edition, Cold
Spring Harbor Press, Cold Spring Harbor, New York (1989),
and in Zyskind, J.W., et al., Recombinant DNA Laboratory
Manual, Academic Press, Inc., New York (1988), both herein
incorporated by reference.

III. THE ANALYSIS OF LARGE DNA SEQUENCES

Both the above-described dideoxy-mediated method and
the Maxam-Gilbert method of DNA sequencing require the
prior isolation of the DNA molecule which is to be
sequenced. The sequence information is obtained by
subjecting the reaction products to electrophoretic
analysis (typically using polyacrylamide gels). Thus, a
sample is applied to a lane of a gel, and the various
species of nested fragments are separated from one another
by their migration velocity through the gel. The number
of nested fragments which can be separated in a single
lane is approximately 200-300 regardless of whether the
Sanger or the Maxam-Gilbert method is used. Those of
great skill in the art can separate up to 600 fragments in
a single lane. Thus, in order to sequence large DNA
molecules, it is necessary to fragment the molecule, and
to sequence the fragments in separate lanes of the
sequencing gel. The sequence of the entire molecule is
obtained by orienting and ordering the sequence data
obtained from each fragment.

Two approaches have been employed by those of skill
in this art to accomplish this goal. In a random or
shotgun sequencing approach, sequence data is collected by
subcloning fragments of the target DNA molecule. No
attempt is initially made to determine the linear
orientation or order of the subclones with respect to the
intact target DNA molecule. Instead, the accumulated data
are stored and ultimately arranged into order by a
computer (Staden, R., Nucleic Acids Res. 14:217 (1986);
Anderson, S. et al., Nature 290:457 (1981); Gingeras,
T.R., J. Biol. Chem. 257:13475 (1982); Sanger, F. et al.,
J. Mol. Biol. 162:729 (1982), and Baer, R. et al., Nature
310:207 (1984)). As will be appreciated, such random
shotgun approaches often result in the multiple sequencing
of the same oligonucleotide fragment, and thus are often
inefficient in terms of time and materials.
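
The computer-assisted ordering of shotgun reads can be sketched with a simple greedy overlap merge. This is not the program described in the cited references; it is a minimal illustration with invented function names and a hypothetical set of reads, one of which is deliberately duplicated to mimic the redundant sequencing mentioned above.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicate reads
    # Drop reads wholly contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no overlaps left; remaining contigs stay separate
        merged = reads[i] + reads[j][k:]
        reads = [r for n, r in enumerate(reads) if n not in (i, j)]
        reads.append(merged)
    return reads


# Hypothetical reads; the duplicate models a fragment sequenced twice.
reads = ["ATGGCGTAC", "GCGTACGTT", "ACGTTAGC", "GCGTACGTT"]
print(greedy_assemble(reads))
```

Greedy merging can mis-assemble repeated sequences, so it should be read as a teaching device rather than a practical assembler.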
In contrast, directed approaches have been employed
in which sequences of the target DNA are obtained in a
systematic fashion. For example, the target DNA molecule
may be ordered by restriction mapping using the methods
described above, and the discrete restriction fragments
sequenced. Alternatively, the target molecule may be
sequenced by sequencing nested sets of deletions which
begin at one of its ends. The use of such nested
fragments progressively brings more and more remote
regions of the target DNA into range for sequencing.
Lastly, sequence information obtained from a particular
target molecule can be used to prepare a primer which can
then be used in a subsequent sequencing reaction in order
to obtain additional sequence information. As will be
perceived, a directed sequence analysis of a target DNA
molecule often requires substantial a priori information
regarding the sequence. Moreover, for large target
molecules (of sizes on the order of kilobases) such as
would be encountered in the sequencing of eukaryotic (and
in particular, mammalian) chromosomes, directional
sequencing is quite arduous.

Several strategies have been developed to facilitate
the sequence analysis of large (multi-kilobase) gene
sequences. In one strategy, a large DNA molecule is
fragmented through the use of restriction endonucleases
which cut at infrequent sites. Such action results in the
production of a small number of fragments each of which
contains a portion of the sequence present in the original
DNA molecule. Due to their smaller size, such fragments
are more amenable to sequence analysis (using the above-
stated methods) than the original DNA molecule. The
sequence of the entire molecule is obtained by orienting
the fragments with respect to each other in order to
produce a gross physical map of the target molecule
(Schwartz, D.C. et al., Cell 37:67-75 (1984); Southern,
E.M. et al., Nucleic Acids Res. 15:5925-5943 (1987);
Burke, D. et al., Science 236:808-812 (1987); Olson, M.V.
et al., Proc. Natl. Acad. Sci. (U.S.A.) 83:7826-7830
(1986)). Since this procedure reveals both the sequence
and the orientation of the fragments, it permits one to
readily determine the sequence of the entire DNA molecule.
Alternatively, a large DNA target may be subcloned
into a large number of randomly selected bacteriophage or
cosmid clones. Overlapping sequences in such clones are
identified by unique restriction enzyme "fingerprinting"
(Olson, M.V. et al., Proc. Natl. Acad. Sci. (U.S.A.)
83:7826-7830 (1986); Coulson, A. et al., Proc. Natl. Acad.
Sci. (U.S.A.) 83:7821-7825 (1986)). The information is
then used to assemble a map of overlapping sets of clones
(Staden, R., Nucleic Acids Res. 8:3673-3694 (1980)). This
method has been successful in generating complete or
partial maps of Saccharomyces cerevisiae chromosomes (Olson,
M.V. et al., Proc. Natl. Acad. Sci. (U.S.A.) 83:7826-7830
(1986)); C. elegans (Coulson, A. et al., Proc. Natl. Acad.
Sci. (U.S.A.) 83:7821-7825 (1986); Coulson, A. et al.,
Nature 335:184-186 (1988)); and the E. coli chromosome
(Kohara, Y. et al., Cell 50:495-508 (1987)).

Several factors may limit the use of conventional
methods in the analysis of the nucleotide sequence of a
target molecule. Typically, each lane of a sequencing gel
can resolve only about 300 different fragments. Thus, in
order to determine the nucleotide sequence of a large DNA
molecule, multiple sequencing gels are often needed.
This, in turn, limits the amount of new sequence
information which can be readily obtained per day. For a
large nucleic acid molecule, a substantial number of
technically demanding and time consuming steps must be
performed. In particular, since the above-described
techniques are capable of analyzing only one set of nested
oligonucleotides per sample, the sequencing of large DNA
molecules requires the use of multiple sequencing gels
each having a large number of lanes. The electrophoretic
analysis step in the sequencing process thus comprises a
significant limitation to the amount of sequence
information which can be obtained and the rate with which
it can be processed.

Similarly, the use of conventional methods in the 
analysis of restriction endonuclease-induced fragments of 
a target molecule is often not straightforward. 

In some cases, sequence and restriction fragment
analysis is limited by the low copy number of each target
DNA sequence in natural genomes. One method for
overcoming this limitation is through the use of
amplification techniques, such as the polymerase chain
reaction, "PCR" (Mullis, K. et al., Cold Spring Harbor
Symp. Quant. Biol. 51:263-273 (1986); Erlich, H. et al., EP
50,424; EP 84,796; EP 258,017; EP 237,362; Mullis, K., EP
201,184; Mullis, K. et al., US 4,683,202; Erlich, H., US
4,582,788; and Saiki, R. et al., US 4,683,194, which
references are incorporated herein by reference). The
technique cannot, however, readily be applied to amplify
every target molecule present in a large gene sequence. A
method for detecting and/or measuring PCR amplification is
disclosed by Brenner, S. et al., in International Patent
Application Publication No. WO/11375. This method entails
linking a loxP sequence to a target molecule during PCR
amplification. Amplification is detected by incubating
the reacted molecules with Cre.

IV. MULTIPLEX ANALYSIS

A substantial improvement in DNA sequencing
technology was recently developed, and designated
"multiplex DNA sequencing" (Church, G.M., et al., Science
240:185-188 (1988); Church, G.M. et al., U.S. Patent
4,942,124; both herein incorporated by reference).
Multiplex DNA sequencing utilizes DNA libraries which are
individually constructed in 20 different plasmid vectors.
In addition to standard drug resistance and replication
origin elements, each of the vectors has a cloning site
flanked by two different, predefined oligonucleotide
"tags" (i.e., forty total tags are used with twenty
vectors). These tags are in turn flanked by sites
recognized by the NotI restriction endonuclease (which
cuts only at infrequent sites). The vectors differ from
each other only by their tag sequences, which are
originally selected from a random collection of chemically
synthesized oligonucleotides.

In accordance with the method, DNA is sonicated to
produce fragments of 900-1500 base pairs. Such DNA is
rendered ligatable through treatment with Bal 31
exonuclease and then with T4 DNA polymerase and all four
deoxynucleotide triphosphates. The DNA fragments are then
ligated separately into each of the vectors and the
ligation mixtures are used to transform E. coli cells.
This procedure thus results in the formation of 20 gene
libraries, which can then be amplified by conventional
means. After amplification, the vectors are treated with
NotI in order to excise the cloned DNA which is to be
sequenced. Such excision produces DNA molecules having
termini which are appropriate for the required subsequent
chemical sequencing. The cloned DNA from each of the
libraries is then mixed together to form a single pool
containing each of the twenty members of the library.

The sequence of the cloned DNA of the libraries is
determined using the Maxam-Gilbert method. The pool of 20
libraries is treated as a single unit in accordance with
that method. The reaction products are then applied to a
sequencing gel, and the oligonucleotides in the DNA sample
are separated using gel electrophoresis. The DNA
patterns, thus obtained, are then electro-transferred from
the gels onto nylon membranes and crosslinked to the
membranes using UV light.

Since each lane of the gel contains the reaction
products of the sequencing of 20 different DNA molecules,
each lane contains 20 overlaid ladders of sequence
information. Because the NotI fragment of the cloned DNA
contains the tag region of the vector, each oligonucleo-
tide of a particular sequence ladder contains the tag
region. A particular sequence ladder may thus be
visualized by hybridizing a labelled probe for a
particular tag to the DNA bound to the membrane. By
washing the membrane with sodium dodecyl sulfate and EDTA,
it is possible to remove the hybridized probe from the
membranes. This step thus prepares the membranes to be
used to analyze a second sequence ladder by hybridizing a
labelled probe for a second particular tag to the DNA
bound to the membrane. In this manner, the sequence
information of twenty vectors can be ascertained from a
sequencing gel.

Thus, whereas conventional techniques permitted the
sequencing of 300 bases per electrophoretic analysis, the
multiplex DNA sequencing approach permits one to obtain
sequence information of 6000 bases per analysis.
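
The way a tag-specific probe pulls one ladder out of an overlaid lane, and the arithmetic behind the 6000-base figure, can be sketched as follows. The data structures, function names, and tag labels are assumptions made for this illustration; they do not describe the cited authors' materials.

```python
def run_pooled_lane(clones):
    """Model one lane as a membrane of (fragment length, tag) pairs.

    `clones` maps a tag name to the ladder of fragment lengths
    produced from the clone carrying that tag.
    """
    return sorted(
        (length, tag) for tag, ladder in clones.items() for length in ladder
    )


def probe(membrane, tag):
    """Return the ladder revealed by a probe specific for `tag`."""
    return [length for length, t in membrane if t == tag]


# Two hypothetical overlaid ladders sharing one lane.
membrane = run_pooled_lane({"tag01": [3, 1, 2], "tag02": [2, 5]})
print(probe(membrane, "tag01"))
print(probe(membrane, "tag02"))

# Throughput as stated in the text: 300 bases per ladder, 20 vectors.
BASES_PER_LADDER = 300
VECTORS = 20
print(BASES_PER_LADDER * VECTORS)
```

Stripping and re-probing the membrane corresponds to calling `probe` again with a different tag on the same `membrane`.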

A significant advance in restriction endonuclease
fragment analysis was recently disclosed by Garcia, E. et
al. (In: Genome Mapping and Sequencing, Abstracts of
Meeting Proceedings, Cold Spring Harbor, May 2-6, 1990,
page 62). This reference concerns the use of a yeast
artificial chromosome vector to clone large DNA sequences
(such as from the human genome). To allow direct end-
labelling of the vectors, the vectors were constructed to
contain a 34 bp loxP fragment. By incubating this vector
in the presence of Cre and a short labelled oligonucleo-
tide which contained a loxP sequence, it was possible to
label the molecules in vitro. The use of radioisotopic or
biotin labels was disclosed.

Evans et al. have recently described a method which
is potentially applicable to cloning, ordering clones, and
the physical mapping of complex genomes (Evans, G.A. et al.,
Proc. Natl. Acad. Sci. (U.S.A.) 86:5030-5034 (1989),
herein incorporated by reference). Unfortunately, Evans et
al. have elected to refer to this method as "multiplex
analysis." The term "multiplex analysis" as used herein
differs significantly from the "multiplex analysis" term
used by Evans et al. Thus, although the Evans et al.
reference uses the same term as that used by the
inventors, it describes a different technique. The use of
the term by the present inventors is consistent with its
use by Church et al., and others in the art.

In brief, the method described by Evans et al. uses
a single cosmid library which is constructed by inserting
random DNA fragments into a site adjacent to T3 or T7
bacteriophage promoters. Because these promoter sequences
flank the cloned DNA (Wahl, G.M. et al., Proc. Natl. Acad.
Sci. (U.S.A.) 84:2160-2164 (1987)), they can be used as
probes to detect clones which have overlapping sequences.

In summary, a method which would minimize the number
of gels needed for the determination of a particular
sequence would, therefore, be highly desirable.
Similarly, a method which would facilitate the
construction of gross and fine restriction maps of a
target molecule would also be highly desirable. Indeed,
for the analysis of very large genomes, such as the human
genome, the development of such methods may be essential.

SUMMARY OF THE INVENTION:

As indicated above, the analysis of a target DNA
molecule often entails fragmenting the molecule, and
analyzing and sequencing the resultant fragments.
Especially for large DNA molecules, this is a difficult
procedure. The present invention relates to an improved
method for constructing gross and fine restriction maps of
a target DNA molecule.

The invention further relates to an improved method
for determining the nucleotide sequence of a target DNA
molecule.

In detail, the invention provides a method for
analyzing a target DNA molecule, which comprises:

(A) forming a recombinant molecule, the recombinant
molecule comprising a probe/primer sequence linked to a
recombinational site (I), wherein the site is linked to
the sequence of the target molecule;

(B) analyzing the target molecule using a nucleic
acid molecule capable of hybridizing to the probe/primer
sequence, or its complement.

The invention also provides the embodiment of the
above method for determining nucleotide sequence of a
target DNA molecule wherein the recombinant molecule is
formed by:

(1) introducing the target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating the vector-DNA construct in the
presence of a recombinase, and a DNA molecule having the
recombinational site (I) and a probe/primer region;
wherein the incubation is under conditions sufficient to
permit the recombinase to mediate recombination between
the recombinational site (II) of the vector-target DNA
construct and the recombinational site (I) of the DNA
molecule; and

(3) permitting the recombinase to mediate
recombination between the recombinational sites, and to
thereby form the sequencing molecule.
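
Steps (1) to (3) can be pictured with a toy string model of reciprocal exchange at a shared recombinational site. This is a sketch under several assumptions, not a description of the claimed method: both molecules are treated as linear with a single site in the same orientation, the function name `recombine` is invented, and the probe/primer, vector and target sequences are hypothetical. The 34 bp sequence used for the site is the commonly cited wild-type loxP sequence.

```python
# Commonly cited 34 bp wild-type loxP site: two 13 bp inverted
# repeats flanking an 8 bp spacer.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def recombine(mol_a, mol_b, site=LOXP):
    """Toy reciprocal exchange between two linear molecules.

    Each molecule must carry `site` exactly once; the arms downstream
    of the site are swapped between the two molecules.
    """
    for mol in (mol_a, mol_b):
        if mol.count(site) != 1:
            raise ValueError("each molecule must carry exactly one site")
    a_left, a_right = mol_a.split(site)
    b_left, b_right = mol_b.split(site)
    return a_left + site + b_right, b_left + site + a_right


# Hypothetical sequences, for illustration only.
PROBE_PRIMER = "GGATCCTTGACA"
VECTOR_ARM = "CCCGGGAAATTT"
TARGET = "ACGTTGCAACGTAGCT"

tag_molecule = PROBE_PRIMER + LOXP         # DNA molecule with site (I)
construct = VECTOR_ARM + LOXP + TARGET     # vector-target construct, site (II)
tagged_target, by_product = recombine(tag_molecule, construct)
print(tagged_target)
```

In this model the first product carries the probe/primer region, the site, and the target sequence in that order, which is the arrangement recited in step (A) above.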

The invention also provides the embodiments of the
above methods for determining nucleotide sequence of a
target DNA molecule wherein at least two vector-target DNA
constructs are formed, and wherein at least two different
DNA molecules each having a recombinational site (I) and
further having a different probe/primer region are
employed; and wherein the determining of the sequence of
the target molecule is through use of two probes, each
capable of hybridizing to only one of the probe/primer
regions, or its complement.

The invention also provides the embodiment of the
above methods for determining nucleotide sequence of a
target DNA molecule wherein the recombination is site-
specific recombination, and, in particular, wherein in the
site-specific recombination, the recombinase is Cre, and
at least one, and preferably both, of the recombinational
sites (I) or (II) are loxP sites. The invention also
provides the embodiments of the above methods wherein the
recombinational site (I) is a loxP site, or a mutant loxP
site, and wherein the vector contains one wild-type loxP
site and one mutant loxP site.

The invention also provides the embodiment of the
above-described method for analyzing a target DNA
molecule, wherein the analysis comprises ordering
restriction endonuclease recognition sites in a target DNA
molecule, and wherein, in step (B), the analysis comprises

(i) incubating the recombinant molecule in the
presence of a restriction endonuclease under conditions
sufficient to permit the endonuclease to cleave DNA
containing a cleavage site recognized by the endonuclease;
and

(ii) determining the order of any restriction sites
in the target molecule using a nucleic acid molecule
capable of hybridizing to a probe/primer sequence, or its
complement.

The invention also provides the embodiment of the
above method for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein the
recombinant molecule is formed by:

(1) introducing the target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating the vector-DNA construct in the
presence of a recombinase, and a DNA molecule having the
recombinational site (I) and a probe/primer region;
wherein the incubation is under conditions sufficient to
permit the recombinase to mediate recombination between
the recombinational site (I) of the DNA molecule, and the
recombinational site (II) of the vector-target DNA
construct;

(3) permitting the recombinase to mediate
recombination between the recombinational sites (I) and
(II), and to thereby form the recombinant molecule.

The invention also provides the embodiment of the
above method for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein at
least two vector-target DNA constructs are formed, and
wherein at least two different DNA molecules each having
a recombinational site (I) and further having a different
probe/primer region are employed; and wherein the ordering
of restriction sites of the target molecule is through use
of two probes, each capable of hybridizing to only one of
the probe/primer regions, or its complement.

The invention also provides the embodiments of the
above methods for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein the
recombination is site-specific recombination, and, in
particular, wherein in the site-specific recombination,
the recombinase is Cre, and at least one, and preferably
both, of the recombinational sites (I) or (II) are loxP
sites. The invention also provides the embodiments of the
above methods for ordering restriction endonuclease
recognition sites in a target DNA molecule wherein the
recombinational site (I) is a loxP site, or a mutant loxP
site, and wherein the vector contains one wild-type loxP
site and one mutant loxP site.

The invention also provides a kit specially adapted
to mediate recombination between a DNA molecule having a
recombinational site (I), and a DNA vector, having a
recombinational site (II), the kit comprising in close
compartmentalization:

1) a first container containing a recombinase
capable of mediating the recombination
between the site (I) of the DNA molecule
and the site (II) of the vector; and

2) a second container containing a DNA
molecule having the recombinational site
(I).

The invention also provides the embodiment of the
above kit wherein the recombinase is Cre, and wherein at
least one, and preferably both, of the recombinational
sites (I) and (II) is a loxP site.

The invention also provides the embodiment of the
above kit wherein the kit additionally contains a third
container containing the DNA vector.

The invention also provides the embodiment of the
above kits wherein the vector contains one loxP site, or
wherein the vector contains one wild-type loxP site and
one mutant loxP site.

The invention also provides a set of nested
oligonucleotides each of which has a first region of
unknown sequence, and a second region of known sequence,
wherein the second region comprises both a recombinational
site, and a probe/primer region.

The invention also provides the embodiments of the
above set of nested oligonucleotides wherein at least one
of the oligonucleotides is hybridized to a probe or to a
primer.

The invention also provides the embodiment of the
above set of nested oligonucleotides wherein the
recombinational site is a loxP site.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows the recombination of a circular DNA
molecule having two loxP sites in direct orientation.

Figure 2 shows the recombination of a circular DNA
molecule having a single loxP site with a linear loxP-
containing DNA molecule to produce a linear molecule.

Figure 3 shows the recombination of the loxP sites in
an inverted repeat orientation.

Figure 4 illustrates the exchange of the DNA that
results from recombination between two linear molecules.

Figure 5 shows the use of Cre and loxP sites.

Figure 6 shows the structure of a vector in which the
recombinational site is illustrated as a loxP site; the
orientation of the loxP is always left to right relative
to the other elements shown.

Figure 7 shows an alternative vector containing two
non-corresponding recombinational sites.

Figure 8 shows the cloning of a target molecule into
a vector.

Figure 9 shows the cloning of a target molecule into
an alternative vector containing two non-corresponding
recombinational sites.

Figure 10 shows the structures of loxP-containing
oligonucleotides having one or more probe/primer regions,
wherein the roman numerals indicate the presence of the
probe/primer regions which may be different or the same in
sequence.

Figure 11 shows the linear molecule that is produced
by incubating the molecules of Figures 8 and 9 together,
in the presence of Cre.

Figure 12 shows the set of nested fragments that is
obtained when the molecule of Figure 10 is subjected to
partial restriction endonuclease digestion with a
restriction enzyme, and analyzed by electrophoresis.

Figure 13 shows the visualization of the nested
fragments, and how such visualization facilitates
restriction mapping.

Figure 14 shows the linear molecule that results from
recombination of the vector through the action of a
suitable recombinase.

Figure 15 shows the structures of members of an
unfractionated vector library after recombination with one
of a plurality of linear molecules each of which differs
from the other in the sequence of its probe/primer
sequence.

Figure 16 shows the structures of loxP511-containing
oligonucleotides having one or more probe/primer regions,
wherein the roman numerals indicate the presence of the
probe/primer regions which may be different or the same in
sequence.

Figure 17 shows the structure of a linear molecule
containing more than one adjacent probe/primer region
separated by a recombinational site.

Figures 18A and 18B show the linear molecule that
would be produced using the loxP/Cre system, after Cre-
mediated recombination with a plasmid of the type shown in
Figure 8 and with an Oligonucleotide I (Primer #n - loxP)
or an Oligonucleotide II (Primer - Probe #n - loxP).

Figure 19 shows the structure of a cosmid containing
recombinational sites.

Figure 20 shows a preferred, single-stranded loxP-
containing oligonucleotide that possesses a sequence which
causes it to snap back upon itself.

Figure 21 shows the result of recombination between
the molecules of Figure 18 and Figure 19.

Figure 22 shows the structures of the molecules of an
array of molecules obtained upon partial restriction
endonuclease digestion of the molecules of Figure 21.

Figure 23 shows the structure of a class of molecules
having two loxP sites in a direct repeat that result from
the incubation of the mixture of oligonucleotides of
Figure 22 with a DNA ligase in the presence of a second
loxP-containing oligonucleotide, such as shown in Figure
20.

Figure 24 shows the structure of a class of molecules
having two loxP sites in an inverted repeat that result
from the incubation of the mixture of oligonucleotides of
Figure 22 with a DNA ligase in the presence of a second
loxP-containing oligonucleotide, such as shown in Figure
20.

Figure 25 shows a partially duplex molecule composed
of certain oligonucleotides.




Figure 26 shows a partially duplex molecule composed 
of certain oligonucleotides. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS

I. RECOMBINATION

The present invention uses the process of recombination
(Watson, J.D., In: Molecular Biology of the Gene,
4th Ed., W.A. Benjamin, Inc., Menlo Park, CA (1987), which
reference is incorporated herein by reference). Thus, an
understanding of the process of recombination is desirable
in order to fully appreciate the present invention.

Recombination is a well-studied natural process which
results in the scission of two nucleic acid molecules
having identical or substantially similar sequences (i.e.
"homologous"), and the joining of the two molecules such
that one region of each initially present molecule becomes
joined to a region of the other initially present molecule
(Sedivy, J.M., Bio/Technol. 6:1192-1196 (1988), which
reference is incorporated herein by reference). The
recombinational reaction is catalyzed by enzymes, globally
referred to as "recombinases." Such enzymes are naturally
present in both prokaryotic and eukaryotic cells (Smith,
G.R., In: Lambda II, (Hendrix, R. et al., Eds.), Cold
Spring Harbor Press, Cold Spring Harbor, NY, pp. 175-209
(1983), herein incorporated by reference). As discussed
below, several recombinases are commercially available.

Two types of recombinational reactions have been
identified. In the first type of reaction, "general" or
"homologous" recombination, any two homologous sequences
can be recognized by the recombinase (i.e. a "general
recombinase"), and thus act as substrates for the
reaction. In contrast, the second type of recombination,
"site-specific" recombination, employs specialized
recombinases (i.e. "site-specific recombinases") which
can recognize only certain defined sequences. Thus, in
site-specific recombination, only molecules having a
particular sequence may act as substrates for the
reaction. The significance of each type of recombinational
reaction is discussed below.

A. GENERAL RECOMBINATION

General recombination is a process by which a
"region" of DNA can be transferred from one DNA molecule
to another. As used herein, a "region" of DNA is intended
to generally refer to any nucleic acid molecule. The
region may be of any length from a single base to a
substantial fragment of a chromosome.

For general recombination to occur between two DNA
molecules, the molecules must possess a "region of
homology" with respect to one another. Such a region of
homology must be at least two base pairs long. Two DNA
molecules possess such a "region of homology" when one
contains a region whose sequence is so similar to a region
in the second molecule that homologous recombination can
occur. The transfer of a region of DNA may be envisioned
as occurring through a multi-step process.

If either of the two participant molecules is a
circular molecule, then the above recombination event
results in the integration of the circular molecule into
the other participant.

The frequency of recombination between two DNA
molecules may be enhanced by treating the introduced DNA
with agents which stimulate recombination. Examples of
such agents include trimethylpsoralen, UV light, etc.

The best characterized general recombination system
is that of the bacterium E. coli (Smith, G.R., In: Lambda
II, (Hendrix, R. et al., Eds.), Cold Spring Harbor Press,
Cold Spring Harbor, NY, pp. 175-209 (1983)). The E. coli
system involves the protein, RecA, which in the presence
of ATP or another energy source, can catalyze the pairing
of DNA molecules at regions of homology. The RecA protein
is commercially available from Pharmacia.

B. SITE-SPECIFIC RECOMBINATION

The above-described process of homologous
recombination can occur between any two homologous DNA
sequences. As indicated above, site-specific
recombination can occur only between certain highly
specialized and defined sequences. Site-specific
recombination is mediated between two such sequences
through the action of one or more specialized enzymes. A
large number of such site-specific recombination systems
have been described. In particular, the P1, Flp, Gin/Fis,
or λ recombinational systems may be employed. For the
purposes of the present invention, the P1 site-specific
recombinational system is preferred.

1. The P1 Site-Specific Recombination System

A preferred site-specific recombination system is
that of the E. coli bacteriophage P1. Like bacteriophage
λ, the P1 bacteriophage cycles between a quiescent,
lysogenic state and an active, lytic state. The
bacteriophage's site-specific recombination system
catalyzes the circularization of P1 DNA (approximately 100
kb) upon its entry into a host cell. It is also involved
in the resolution of multimeric P1 DNA molecules which may
form as a result of replication or homologous
recombination.

The P1 site-specific recombination system catalyzes
recombination between specialized sequences, known as
"loxP" sequences. The loxP site has been shown to consist
of a double-stranded 34 bp sequence. This sequence
contains two 13 bp inverted repeat sequences which are
separated from one another by an 8 bp spacer region
(Hoess, R., et al., Proc. Natl. Acad. Sci. (U.S.A.)
79:3398-3402 (1982); Sauer, B.L., U.S. Patent No.
4,959,317, herein incorporated by reference).
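
The 13 bp / 8 bp / 13 bp architecture described above can be checked mechanically. The following is a minimal sketch only: the 34 bp sequence used is the commonly published wild-type loxP sequence, which is not reproduced in the present text and is therefore an assumption here, and the helper names are hypothetical.

```python
# Commonly published wild-type loxP sequence (an assumption; the text above
# gives only the 13 + 8 + 13 bp layout, not the bases themselves).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def split_lox_site(site):
    """Split a 34 bp site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("a loxP-type site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


left_arm, spacer, right_arm = split_lox_site(LOXP)

# The two 13 bp arms form an inverted repeat when one arm is the
# reverse complement of the other.
arms_are_inverted_repeats = right_arm == reverse_complement(left_arm)

# The spacer is asymmetric when it differs from its own reverse complement.
spacer_is_asymmetric = spacer != reverse_complement(spacer)
```

Under these assumptions both flags evaluate to true: the arms are symmetric, while the spacer is not.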

The recombination is mediated by a P1-encoded protein
known as "Cre" (Hamilton, D.L., et al., J. Mol. Biol.
178:481-486 (1984), herein incorporated by reference).
The Cre protein mediates recombination between two loxP
sequences (Sternberg, N., et al., Cold Spring Harbor Symp.
Quant. Biol. 45:297-309 (1981)). These sequences may be
present on the same DNA molecule, or they may be present
on different molecules. Cre protein has a molecular
weight of 38,000. The protein has been purified to
homogeneity, and its reaction with the loxP site has been
extensively characterized (Abremski, K., et al., J. Mol.
Biol. 259:1509-1514 (1984), herein incorporated by
reference). The cre gene (which encodes the Cre protein)
has been cloned (Abremski, K., et al., Cell 32:1301-1311
(1983), herein incorporated by reference). Cre protein
can be obtained commercially from New England
Nuclear/Dupont.

The site-specific recombination catalyzed by the
action of Cre protein on two loxP sites is dependent only
upon the presence of the above-described thirty-four base
pair loxP site and Cre. Magnesium ions or spermidine are
needed for efficient recombination. Energy, however, is
not required for this reaction; thus, there is no
requirement for ATP or other similar high energy
molecules. No proteins other than Cre are required in
order to mediate site-specific recombination at loxP sites
(Abremski, K., et al., J. Mol. Biol. 259:1509-1514
(1984)). In vitro, the reaction is highly efficient; Cre
is able to convert up to about 70% of the DNA substrate
into products and it appears to act in a stoichiometric
manner. The extent of reaction reflects an equilibrium
among the various molecules containing loxP sites.

Cre-mediated recombination can occur between loxP
sites which are both present on the same molecule, or
which are present on two different molecules. Because the
internal spacer sequence of the loxP site is asymmetrical,
two loxP sites can exhibit directionality relative to one
another (Hoess, R.H., Proc. Natl. Acad. Sci.
(U.S.A.) 81:1026-1029 (1984)). When two sites on the same
DNA molecule are in a directly repeated orientation (i.e.
→ →), Cre will excise the DNA between the sites
(Abremski, K., et al., Cell 32:1301-1311 (1983)).
However, if the sites are inverted with respect to each
other (i.e. → ←), the DNA between them is not
excised after recombination but is simply inverted. Thus,
a circular DNA molecule having two loxP sites in direct
orientation will recombine to produce two smaller circles,
whereas circular molecules having two loxP sites in an
inverted orientation simply invert the DNA sequences
flanked by the loxP sites (Figure 1).
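
The two outcomes described above (excision for directly repeated sites, inversion for inverted sites) can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the enzymology: a molecule is represented as a list of labelled segments, the function name and arguments are hypothetical, and an inverted segment is shown only as reversed in order (its strand complementation is ignored).

```python
def cre_intramolecular(segments, i, j, same_orientation):
    """Model Cre acting on two loxP sites at indices i < j of one molecule.

    segments         -- list of segment labels, e.g. ["A", "loxP", "B", ...]
    same_orientation -- True for a direct repeat, False for an inverted repeat

    Returns (resulting molecule, excised circle or None).
    """
    between = segments[i + 1:j]
    if same_orientation:
        # Direct repeat: the DNA between the sites is excised as a circle
        # that carries one loxP site; one site stays on the parent molecule.
        remaining = segments[:i + 1] + segments[j + 1:]
        excised_circle = between + [segments[j]]
        return remaining, excised_circle
    # Inverted repeat: nothing is lost; the flanked DNA is flipped in place.
    inverted = segments[:i + 1] + between[::-1] + segments[j:]
    return inverted, None


molecule = ["A", "loxP", "B", "C", "loxP", "D"]
direct_result = cre_intramolecular(molecule, 1, 4, same_orientation=True)
inverted_result = cre_intramolecular(molecule, 1, 4, same_orientation=False)
```

In this model the direct-repeat case leaves a parent molecule with a single loxP site plus a circle containing segments B and C, while the inverted case returns the same molecule with B and C in reversed order.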

Two circular molecules each having a single loxP site
will recombine to form a mixture of monomer, dimer,
trimer, etc. circles. Higher concentrations of circles
favor higher n-mers; lower concentrations of circles favor
monomers.

A circular DNA molecule having a single loxP site
will recombine with a linear loxP-containing DNA molecule
to produce a linear molecule (Figure 2).
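
The circle-plus-linear reaction can be sketched in the same segment-list style. The code below is illustrative only and rests on assumptions: each molecule carries exactly one site, segment labels such as "probe", "vector" and "target" are hypothetical, and site orientation is not modelled.

```python
def recombine_circle_with_linear(circle, linear, site="loxP"):
    """Insert a single-site circle into a single-site linear molecule.

    circle -- segment labels of a circular molecule (list order is arbitrary
              because the molecule has no ends)
    linear -- segment labels of a linear molecule, left to right
    Returns the linear product, in which the former circle is flanked
    by two copies of the site.
    """
    c = circle.index(site)
    l = linear.index(site)
    # Open the circle at its site and read its contents once around.
    opened = circle[c + 1:] + circle[:c]
    return linear[:l] + [site] + opened + [site] + linear[l + 1:]


vector_circle = ["loxP", "vector", "target"]
oligonucleotide = ["probe", "loxP", "tail"]
linear_product = recombine_circle_with_linear(vector_circle, oligonucleotide)
```

The product is a single linear molecule in which the sequences originally on either side of the linear molecule's site now lie at opposite ends of the former circle.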

As indicated above, a linear molecule with direct
repeats of loxP sites interacts to produce a circle
(containing the sequences between the loxP sites), and a
linear molecule. However, if the loxP sites are inverted
repeats, recombination flips the sequence between the loxP
sites back and forth (Figure 3).

When the starting DNA substrate is supercoiled, the
final reaction product is also supercoiled (Abremski, K.,
et al., Cell 32:1301-1311 (1983); Abremski, K., et al.,
J. Biol. Chem. 261:391-396 (1986)). The recombinational
event does not, however, require supercoiling, and works
with equal efficiency on supercoiled or linear molecules
(Abremski, K., et al., Cell 32:1301-1311 (1983); Abremski,
K., et al., J. Biol. Chem. 261:391-396 (1986)).

The nature of the interaction between Cre and a loxP
site has been extensively studied (Hoess, et al.,
Cold Spring Harb. Symp. Quant. Biol. 49:761-768 (1984),
herein incorporated by reference). In particular,
mutations have been produced both in Cre, and in the loxP
site.

The Cre mutants thus far identified have been found
to catalyze recombination at a slower rate than that of
the wild-type Cre protein. loxP mutants have been
identified which recombine at lower efficiency than the
wild-type site (Abremski, K., et al., J. Biol. Chem.
261:391-396 (1986); Abremski, K., et al., J. Mol. Biol.
202:59-66 (1988), herein incorporated by reference).

Of particular interest to the present invention is
the loxP511 mutant site. The sequence of loxP511 is
described by Hoess, R.H. et al. (Nucleic Acids Res.
14:2287-2300 (1986), herein incorporated by reference).
Cre can mediate efficient recombination between two loxP
sites, or between two loxP511 sites; it is, however,
substantially incapable of mediating efficient
recombination between a loxP site and a loxP511 site.
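
The pairing rule stated above can be written as a small predicate. This is a hedged sketch: the function name is hypothetical, and only the two site types named in the text are modelled.

```python
KNOWN_SITES = {"loxP", "loxP511"}


def cre_recombines_efficiently(site_a, site_b):
    """Return True if Cre is expected to recombine the two sites efficiently.

    Encodes only the rule given in the text: like sites pair with like
    (loxP with loxP, loxP511 with loxP511); mixed pairs do not.
    """
    if site_a not in KNOWN_SITES or site_b not in KNOWN_SITES:
        raise ValueError("only loxP and loxP511 are modelled in this sketch")
    return site_a == site_b


# For a vector carrying one site of each type, an oligonucleotide is
# expected to react only at the vector site matching its own.
vector_sites = ["loxP", "loxP511"]
targets_of_loxP511_oligo = [
    s for s in vector_sites if cre_recombines_efficiently("loxP511", s)
]
```

Such a rule is what allows recombination to be directed to one of two sites on the same vector without disturbing the other.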

The Cre protein is capable of mediating lox-specific
recombination in Saccharomyces cerevisiae (Sauer, B.,
Molec. Cell. Biol. 7:2087-2096 (1987); Sauer, B.L., U.S.
Patent No. 4,959,317, herein incorporated by reference).
Such a property indicates that the Cre protein is capable
of accessing DNA in eukaryotic cells even though such DNA
is typically organized into nucleosomes within the
nucleus, and bound to histones and other proteins.

Significantly, the loxP-Cre system can mediate site-
specific recombination between loxP sites separated by
extremely large numbers of nucleotides (Sternberg, N.,
Proc. Natl. Acad. Sci. (U.S.A.) 87:103-107 (1990), herein
incorporated by reference). Indeed, the ability of Cre to
circularize the bacteriophage P1 evidences its ability to
mediate the recombination of large DNA molecules.
Recombination has been demonstrated to occur between two
loxP sites present on the 150 kb genome of the pseudorabies
virus (Sauer, B., et al., Gene 70:331-341 (1988),
herein incorporated by reference).

It has been found that certain E. coli enzymes
inhibit efficient circularization of linear molecules
which contain loxP sites at their termini. Hence,
enhanced circularization efficiency can be obtained
through the use of E. coli mutants which lack exonuclease
V activity (Sauer, B., et al., Gene 70:331-341 (1988)).

Cre has been able to mediate loxP-specific
recombination in mammalian cells (Sauer, B., et al., Proc.
Natl. Acad. Sci. (U.S.A.) 85:5166-5170 (1988); Sauer, B.,
et al., Nucleic Acids Res. 17:147-161 (1989), both
references herein incorporated by reference). Similarly,
the recombination system has been capable of catalyzing
recombination in plant cells (Dale, E.G., et al., Gene
91:79-85 (1990)).

2. The Flp Recombination System

Yeast express a recombinase known as "Flp" which
catalyzes the site-specific inversion of a region of the
yeast 2-μ circle plasmid (Schwartz, C.J. et al., J. Molec.
Biol. 205:647-658 (1989); Parsons, R.L. et al., J. Biol.
Chem. 265:4527-4533 (1990); Golic, K.G. et al., Cell
59:499-509 (1989); Amin, A.A. et al., J. Molec. Biol.
214:55-72 (1990)). The flp gene has been cloned, and the
site ("FRT") which is recognized by the Flp recombinase
has been determined (Vetter, D., et al., Proc. Natl. Acad.
Sci. (U.S.A.) 80:7284-7288 (1983), herein incorporated by
reference). The organization of the FRT sequence is
similar to that of the loxP sequence recognized by Cre;
however, the sequence contains a nearly perfect inverted
repeat and a direct repeat. Flp-mediated recombination
optimally occurs in vitro at a pH of between 6.6 and 8.0.
A divalent cation such as Mg2+ or spermidine is required.
The Flp protein can recombine 2-μ derivatives having
directly repeated FRT sites to produce two circular
plasmids. It does not require either host factors or
supercoiled substrates (Vetter, D., et al., Proc. Natl.
Acad. Sci. (U.S.A.) 80:7284-7288 (1983)).

3. The Gin/Fis Recombination System

The E. coli bacteriophage Mu has been found to encode
a protein (Gin) which can mediate recombination at
specific sites in the Mu genome (Mertens, G. et al., EMBO
J. 3:2415-2421 (1984); Mertens, G. et al., J. Biol. Chem.
261:15668-15672 (1986)). The reaction causes a site-
specific inversion of the Mu G segment and results in an
altered host range. Recombination occurs at 34 base pair
long sites, which must be arranged as inverted repeats on
supercoiled DNA.

In order for inversion to occur, the phage-encoded
gin gene must be expressed. The product of this gene
(Gin) binds to sites in each of the inverted repeats in a
cooperative manner, induces a two base pair staggered
nick, and forms a covalent linkage with DNA at the 5' end
of each nick. In the presence of Gin alone, DNA inversion
occurs with low frequency both in vitro and in vivo. In
order to stimulate inversion, an E. coli host factor,
known as "Fis" is typically required, unless a Fis-
independent Gin mutant (i.e. a protein capable of
catalyzing recombination in a site-specific manner without
host factor) is employed. Such a mutant is disclosed by
Klippel, A. et al. (EMBO J. 7:3983-3989 (1988)).

4. The λ Recombination System

The site-specific recombination system of the E. coli
bacteriophage λ has been well characterized (Weisberg, R.
et al., In: Lambda II, (Hendrix, R. et al., Eds.), Cold
Spring Harbor Press, Cold Spring Harbor, NY, pp. 211-250
(1983), herein incorporated by reference). Bacteriophage
λ uses this recombinational system in order to integrate
its genome into that of its host, the bacterium E. coli.
The system is also employed to excise the bacteriophage
from the host genome in preparation for the virus' lytic
growth.

The recombination system is composed of four
proteins: Int and Xis, which are encoded by the virus, and
two host factors encoded by E. coli. These proteins
catalyze site-specific recombination between "Att" sites.
The λ Int protein (together with the E. coli host
integration factors) will catalyze recombination between
"AttP" and "AttB" sites. If the AttP sequence is present
on a circular molecule, and the AttB site is present on a
linear molecule, the result of the recombination is the
disruption of both Att sites, and the insertion of the
entire AttP-containing molecule into the AttB site of the
second molecule. The newly formed linear molecule will
contain an AttL and an AttR site at the termini of the
inserted molecule.
Even in the presence of host factors, the λ Int
enzyme, by itself, is unable to catalyze the excision of
the inserted molecule. Thus, the reaction is
unidirectional. If a second λ protein, the λ Xis protein,
is added to the reaction, the reverse reaction can
proceed, and a site-specific recombinational event will
occur between the AttR and AttL sites to regenerate the
initial molecules.

The sequences of both the Int and Xis proteins are
known, and both proteins have been purified
to homogeneity. Both the integration and the excision
reaction can be conducted in vitro. The nucleotide
sequences of the four Att sites have also been determined
(Weisberg, R. et al., In: Lambda II, (Hendrix, R. et al.,
Eds.), Cold Spring Harbor Press, Cold Spring Harbor, NY,
pp. 211-250 (1983), which reference has been herein
incorporated by reference).

5. Other Site-Specific Recombination Systems

Any of a large number of additional site-specific
recombination systems can be used in accordance with the
methods of the present invention.
Such systems are discussed by Echols, H. (J. Biol. Chem.
265:14697-14700 (1990)), de Villartay, J.P. (Nature
335:170-174 (1988)), Craig, N.L. (Ann. Rev. Genet. 22:77-
105 (1988)), Poyart-Salmeron, C. et al. (EMBO J. 8:2425-
2433 (1989)), Hunger-Bertling, K. et al. (Molec. Cell.
Biochem. 92:107-116 (1990)), and Cregg, J.M. (Molec. Gen.
Genet. 219:320-323 (1989)), all herein incorporated by
reference. Examples of preferred additional recombination
systems include: the TpnI and the β-lactamase transposon
systems (Levesque, R.C., J. Bacteriol. 172:3745-3757
(1990)); the Tn3 resolvase system (Flanagan, P.H. et al.,
J. Molec. Biol. 206:295-304 (1989); Stark, W.M. et al.,
Cell 58:779-790 (1989)); the yeast recombinase systems
(Matsuzaki, H. et al., J. Bacteriol. 172:610-618 (1990));
the B. subtilis SpoIVC recombinase system (Sato, T. et
al., J. Bacteriol. 172:1092-1098 (1990)); the Hin
recombinase system (Glasgow, A.C. et al., J. Biol. Chem.
264:10072-10082 (1989)); the immunoglobulin recombinase
systems (Malynn, B.A. et al., Cell 54:453-460 (1988)); the
Cin recombinase system (Hafter, P. et al., EMBO J. 7:3991-
3996 (1988); Hubner, P. et al., J. Molec. Biol. 205:493-
500 (1989)); and the Pin recombinase system (Plasterk, R.H.A.
et al., Cold Spring Harbor Symp. Quant. Biol. 49:295-300
(1984)); all of the above references are herein
incorporated by reference.

II. RECOMBINATION-FACILITATED SEQUENCE ANALYSIS

As indicated above, the multiplex sequencing method
of G.M. Church et al. (Science 240:185-188 (1988))
requires the construction of a large number of vector
libraries. In contrast, the present invention achieves
the goal of multiplex sequencing without the need to
construct multiple gene libraries. In accordance with the
present invention, DNA or cDNA from any desired source is
obtained, and cloned into a cloning site of any of the
well-known prokaryotic, eukaryotic, or shuttle vectors,
modified to contain a recombinational site.
Examples of suitable vectors are provided by Maniatis, T.,
et al. (In: Molecular Cloning, A Laboratory Manual, Cold
Spring Harbor Press, Cold Spring Harbor, NY (1982)). The
sequence information is then obtained through the use of
a novel method.

Of particular importance to the present invention is
the fact that recombination between two linear molecules
results in the exchange of their DNA (Figure 4).

Thus, if in Figure 4, the sequence 1-7 represented a
target molecule of unknown sequence, and the sequence H I
contained a detectable marker, the result of the
recombination would be to link the detectable marker to
the target molecule.

Any of the above-described recombinases and their
corresponding recombinational sites may be employed. As
used herein, a "recombinational site" is a region of a
DNA molecule of a sequence and size sufficient to permit
it to function as a substrate in a recombinational
reaction when provided with a suitable recombinase, and a
second DNA molecule having a suitable recombinational
site. Where the recombinase is a general recombinase, the
recombinational site can be of any size or sequence. More
preferably, however, the recombinational sites are
selected so as to be capable of serving as a substrate in
a recombinational reaction catalyzed by a site-specific
recombinase, preferably Cre. For example, where the
recombinational sites are loxP sites, the recombinase
would be Cre; where the recombinational sites are attP and
attB sites, the recombinase would be Int. Any other
combination of recombinase and recombinational site may be
used. An example of the use of Cre and loxP sites is
shown in Figure 5.

Recombination between a DNA molecule containing a
loxP site and a double-stranded oligonucleotide containing
a loxP site thus causes a "cut" and religation at the loxP
sites. By using high molar ratios of such oligonucleotides
to target molecules, the reactions may be driven
toward completion.

The present invention exploits this capacity through
the use of oligonucleotides which contain loxP sites.
Such oligonucleotides may be of any length; however, it is
preferable to employ oligonucleotides of 20-50 (and
preferably about 34) base pairs. Such an oligonucleotide
will contain a recombinational site, which may be at a
terminus of the molecule, or may be flanked by other bases
of the oligonucleotide.

Where one desires to use a vector to sequence a
target molecule, the loxP site of the vector is preferably
located as close as possible to the target sequence (thus
minimizing the size of the vector sequence which would
need to be sequenced in order to complete the desired
sequencing of the target).

Where one desires to map the restriction sites of a
target molecule, the loxP site is preferably located about
500 bases away from the target sequence (if a plasmid is
employed) or 1000-2000 bases away from the target sequence
(if a cosmid is employed).

As indicated above, the vectors of the present
invention contain at least one "recombinational site."
The loxP site is the preferred recombinational site of the
present invention. In the most preferred embodiment, the
vector shall contain one loxP site. The recombinational
site will be incorporated into the vector at a location
near the location of the cloning region. The structure of
the vector is thus depicted in Figure 6 (where the
recombinational site is illustrated as a loxP site; the
orientation of the loxP is always left to right relative
to the other elements shown).

In an alternative preferred embodiment, the vector
shall contain two recombinational sites, which shall flank
the cloning region. Most desirably, the two
recombinational sites shall differ from one another such
that it is possible to mediate recombination at each
recombinational site without mediating recombination at
the other. This can be accomplished, for example, through
the use of vectors having a wild type and a mutant site
(such as a wild type and a mutant loxP site), or through
the use of a vector having non-corresponding sites (such
as a loxP site and an attP site, etc.). The orientation
of the regions is such that the following structure is
formed (illustrated using loxP and the loxP mutant site,
loxP511) (Figure 7).

Having now generally described the invention, the
same will be more readily understood through reference to
the following examples which are provided by way of
illustration, and are not intended to be limiting of the
present invention, unless specified.

EXAMPLE 1 
RESTRICTION FRAGMENT ANALYSIS 

In accordance with this aspect of the present
invention, a DNA molecule whose sequence is to be mapped
by restriction endonuclease digestion is obtained from a
suitable source. Preferably, such target DNA has been
isolated using a restriction endonuclease which is also
capable of cleaving at a site within the cloning region of
the above-described vectors. Alternatively, conventional
methods can be used to adapt the ends of the target such
that they are now capable of being ligated into a
restriction site of the cloning region. This can, for
example, be accomplished with any target by treating
overhanging ends to produce a blunt ended target molecule.
Once the ends of the target molecule have been so
prepared, one of the above-described vector molecules is
cleaved with a restriction endonuclease capable of
cleaving the vector within the cloning region, and forming
termini which are capable of ligating with the target
molecule.

The target molecule is then introduced into the
vector, and the vector is recircularized through the
action of a DNA ligase. Procedures for accomplishing
these steps are disclosed by Maniatis, T., et al. (In:
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Press, Cold Spring Harbor, NY (1982)).

The use of a vector having two different recombinational
sites facilitates the analysis of restriction sites
at both ends of the target molecule. For the purposes of
illustrating the invention, however, a vector containing
a single recombinational site is depicted. The insertion
of the target sequence is shown in Figure 8 and Figure 9.
This vector is then incubated in the presence of Cre and
a loxP-containing oligonucleotide having one or more
probe/primer regions (Figure 10). The resulting recombination
creates a linear molecule, shown in Figure 11.

When such a molecule is subjected to partial
restriction endonuclease digestion with a restriction
enzyme, and analyzed by electrophoresis, a set of nested
fragments containing the probe/primer region is obtained
(Figure 12).

Significantly, since these nested fragments contain
the probe/primer region of the original molecule, a probe
having a sequence substantially complementary to that of
the probe/primer region will be able to hybridize to the
fragments. By labelling such a probe, it is thus possible
to visualize the nested fragments which hybridize to the
probe/primer region. Moreover, because the molecules
share one end in common, the position of the restriction
sites can be readily determined by measuring the sizes of
the "bands," as shown in Figure 13.

It is possible to prepare multiple sets of nested
fragments using different restriction endonucleases, and
loxP-containing oligonucleotides having different
probe/primer regions. Each set of nested fragments can be
visualized by incubating the total set of fragments with
a probe substantially complementary to the probe/primer
region of the respective set of nested fragments. Thus,
through the use of multiple sets of different probes (each
capable of hybridizing to a different probe/primer
region), the present invention permits one to sequentially
analyze all of these nested sets of fragments. Indeed, if
different labels are employed on different probes, it is
possible to simultaneously analyze different sets of
nested fragments.

By conducting the analysis using sets of loxP-containing
oligonucleotides which contain two probe/primer
regions, one common to all of the oligonucleotides, and
one that is varied to permit the individual visualization
of a single set of nested fragments, it is possible to
visualize all of the sets of nested fragments.

Since such analyses may be conducted from a single
gel, the present invention greatly facilitates the process
of restriction mapping.
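
The mapping logic of this example can be illustrated with a short sketch. It assumes that positions are measured in base pairs from the probe/primer end of the linear molecule, and that the largest band detected is the uncut molecule. The function names and the example sizes are hypothetical and are not taken from the disclosure.

```python
def nested_fragment_sizes(site_positions, total_length):
    """Sizes of partial-digest fragments that retain the probe/primer end.

    Every fragment detected by the probe starts at the probe/primer end and
    stops at one restriction site, or at the far end if the molecule is uncut.
    """
    sizes = sorted(p for p in site_positions if 0 < p < total_length)
    return sizes + [total_length]


def map_sites_from_bands(band_sizes):
    """Recover site positions from the band sizes seen with the probe.

    The largest band is taken to be the uncut molecule; each smaller band
    size equals the distance of one site from the probe/primer end.
    """
    ordered = sorted(band_sizes)
    return ordered[:-1], ordered[-1]


# Hypothetical linear molecule of 4100 bp with three sites for one enzyme.
bands = nested_fragment_sizes([2750, 400, 1300], 4100)
sites, length = map_sites_from_bands(bands)
print(sites, length)
```

Because all detected fragments share the labelled end, no pairwise fragment comparison is needed; sorting the band sizes is sufficient to order the sites.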

EXAMPLE 2
DNA SEQUENCE ANALYSIS

In this aspect of the present invention, the sequence 
of a cloned region can be determined in a multiplex 
analysis. 

The method utilizes two types of DNA molecules. The
first molecule is a cloning vector which contains a loxP
site. Any of the well-known prokaryotic, eukaryotic, or
shuttle vectors may be modified to permit their
use in the present invention. The vector shall contain at
least one recombinational site, preferably loxP, which
precedes and is adjacent to and, most preferably,
immediately adjacent to a cloning region. The structure
of the vector is as shown in Figure 6 or Figure 7.

In accordance with the sequencing method of the
present invention, a DNA molecule whose sequence is to be
determined (i.e., a "target" sequence) is obtained from a
suitable source. Preferably, such target DNA has been
isolated using a restriction endonuclease which is also
capable of cleaving at a site within the cloning region of
the above-described vector. Alternatively, conventional
methods can be used to adapt the ends of the target such
that they are capable of being ligated into a
restriction site of the cloning region. This can, for
example, be accomplished with any target by treating
overhanging ends to produce a blunt-ended target molecule.

Once the ends of the target molecule have been so
prepared, one of the above-described vector molecules is
cleaved with a restriction endonuclease capable of
cleaving the vector within the cloning region, and forming
termini which are capable of ligating with the target
molecule.

The target molecule is then introduced into the
vector, and the vector is recircularized through the
action of a DNA ligase. Procedures for accomplishing
these steps are disclosed by Maniatis, T., et al. (In:
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor
Press, Cold Spring Harbor, NY (1982)). The construction
is as shown in Figure 8 and Figure 9.

Once the target DNA has been inserted into any of the
above-described vectors, it can then be amplified by
propagating the vector in a suitable host. Individual
members of the library (either as transformed cells, or
isolated DNA) can then be isolated.

In order to accomplish multiplex sequencing of the
vector, one permits the vector to undergo site-specific
recombination with a linear DNA molecule having a
recombinational site, and at least one probe/primer
region, located near the recombinational site (Figure 10).
Preferably, the linear molecule will have probe/primer
regions on both sides of the recombinational site.

In the presence of a suitable recombinase (such as
Cre), the vector and the linear molecule recombine to form
a linear molecule as shown in Figure 14.

If the sequencing reaction is done using the Maxam-Gilbert
method, then a probe capable of hybridizing to
probe/primer I will identify one set of nested sequencing
reaction products.

Alternatively, the target can be sequenced using the
Sanger method by employing a primer having a 3' OH terminus
which is capable of hybridizing to the probe/primer
sequence of the fragment. Extension of this primer
creates a set of nested sequencing reaction products.

Significantly, a probe/primer is capable of
identifying only that nest of reaction products which has
a probe/primer sequence to which it can hybridize. Thus,
the use of a probe capable of hybridizing to a different
probe/primer sequence will identify a different set of
nested sequencing reaction products.

This feature of the present invention permits a
multiplex analysis to be performed. To accomplish this,
members of the unfractionated vector library are
separately permitted to recombine with one of a plurality
of linear molecules, each of which differs from the others
in the sequence of its probe/primer sequence. The result
of such recombination may be depicted as shown in Figure
15.

Since a probe/primer capable of hybridizing to one
probe/primer sequence (for example probe/primer sequence
I in Figure 15) will identify only that set of nested
sequence reaction products which contains the probe/primer
sequence, all of the sequence reactions may be combined
and analyzed on the same sequencing gel, by sequentially
hybridizing with a different probe/primer each time. Thus,
where probe/primers capable of hybridizing to probe/primer
sequences I, II, III, and IV are used, a single sequencing
gel can be used to determine the sequence of target
molecules A, B, C and D.
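
The sequential probing described above can be pictured with a minimal sketch. It assumes that each reaction product in the pooled lane can be represented as a pair of (probe/primer sequence label, fragment length), and that a probe detects only products carrying its own label. The data and the function name are invented for illustration.

```python
def probe_pooled_lane(pooled_products, probe_label):
    """Return the fragment lengths detected by one probe in a pooled lane.

    pooled_products is a list of (label, length) pairs; only products whose
    probe/primer sequence matches the probe are detected.
    """
    return sorted(length for label, length in pooled_products
                  if label == probe_label)


# Hypothetical pooled reaction products from four recombinant molecules.
pooled = [("I", 120), ("II", 95), ("I", 87), ("III", 140),
          ("II", 60), ("IV", 77)]

# One round of hybridization, detection and stripping per probe.
for label in ("I", "II", "III", "IV"):
    print(label, probe_pooled_lane(pooled, label))
```

Each pass over the same pooled lane yields the ladder for one target only, which is why a single gel can serve all four targets.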

Use of a vector having a second recombinational site
(such as loxP511) adjacent to the second end of the
inserted target molecule, in conjunction with a linear
molecule (such as shown in Figure 16) permits one to
identify a set of nested sequencing reaction products of
the other strand of the target molecule. Thus, such a
probe would permit the sequencing of the second strand of
a DNA molecule (or equivalently, would yield sequence
information relevant to the 3' end of the sequence of
depicted target molecule A).

Significantly, the invention thus permits the
sequencing of both strands of a DNA molecule on the same
sequencing gel.

As indicated above, in one embodiment, the linear
molecule will contain more than one adjacent probe/primer
region separated by a recombinational site (Figure 17).
If one employs a set of such linear molecules in which
probe/primer Z or W is kept invariant, whereas the
sequence of probe/primer sequence I or II is varied, then
one has the capacity to perform either a multiplex
sequence analysis (using probe/primers capable of
hybridizing to the sequence of probe/primer sequence I or
II or their variants) or a non-multiplex sequence analysis
(using probe/primers capable of hybridizing to the
sequence of probe/primer sequence Z or W).

EXAMPLE 3
DETAILED DESCRIPTION OF MULTIPLEX SEQUENCE ANALYSIS

To perform multiplex sequence analysis, a series of
oligonucleotides is constructed. These oligonucleotides
will have a probe/primer region, which may have either of
two general structures (depicted using loxP as the
recombinational site): Oligonucleotide I (Primer #n - loxP)
or Oligonucleotide II (Primer - Probe #n - loxP), where the
n indicates the number of different oligonucleotides in
the series.

The above-described vectors are incubated in the
presence of a recombinase and each of the oligonucleotides
of the series, in separate reactions. To illustrate this
aspect of the invention, using the loxP/Cre system,
after Cre-mediated recombination with a plasmid of the
type shown in Figure 8 and with an Oligonucleotide I
(Primer #n - loxP), the linear molecule shown in Figure
18A would be produced.

After Cre-mediated recombination with a plasmid of
the type shown in Figure 8 with an Oligonucleotide II
(Primer - Probe #n - loxP), the linear molecule shown in
Figure 18B would be produced.

As will be appreciated, either of such molecules can
be employed to determine the sequence of the target
molecule using either the Sanger or Maxam-Gilbert methods.
Where oligonucleotides of class I are employed, n
different primers would be added to each sequencing
reaction, and the probes to detect the sequencing products
would be the complements of the primers. Since such
different probes are being used, it is possible to analyze
all n sequence reactions using a sequencing gel, through
a multiplex sequence analysis. Where oligonucleotides of
class II are employed, a single primer would be used in
the sequencing reactions, and different probes (1, 2,
etc.) would be used. This latter embodiment is preferred,
except that target sequence would not be reached until
about 77 nucleotides from the 5' end of the primer (i.e.,
20 nucleotides of the primer, 20 nucleotides of the probe,
34 nucleotides of the loxP site, and 3 nucleotides from
the remainder of the cloning site (e.g., SmaI)). When one
wishes to eliminate the need to sequence 20 of these
nucleotides, one would include a deoxyuracil (dU) toward
the 3' end of the primer, and treat with the enzyme UDG
(Uracil DNA Glycosylase) just before running the
sequencing gel. This treatment renders the sites abasic,
but does not cleave the phosphodiester backbone of the DNA
molecule. Cleavage may be accomplished by heating the
reaction, or by incubating it in the presence of an enzyme
(such as endonuclease IV of E. coli) capable of
specifically cleaving nucleic acid molecules at abasic
sites. Note that after Cre-loxP recombination, the
priming sites would be completely single stranded, even
without denaturation, since the recombination
oligonucleotide would be single stranded in the primer
domain.
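
The distances quoted above follow from simple addition, as the sketch below shows. The four segment lengths are those stated in the text; the function name and its parameters are hypothetical, and the "primer removed" case simply models the dU/UDG treatment as subtracting the 20 primer nucleotides.

```python
# Segment lengths, in nucleotides, as stated in the text.
PRIMER_NT = 20
PROBE_NT = 20
LOXP_NT = 34
CLONING_SITE_NT = 3


def offset_to_target(has_probe_region, primer_removed=False):
    """Nucleotides read before target sequence is reached.

    has_probe_region: True for class II oligonucleotides (Primer - Probe - loxP),
                      False for class I oligonucleotides (Primer - loxP).
    primer_removed:   True if the primer is cleaved off before the gel is run.
    """
    total = PRIMER_NT + LOXP_NT + CLONING_SITE_NT
    if has_probe_region:
        total += PROBE_NT
    if primer_removed:
        total -= PRIMER_NT
    return total


print(offset_to_target(False))        # class I
print(offset_to_target(True))         # class II
print(offset_to_target(True, True))   # class II after primer removal
```

Under these assumptions the class I and class II values agree with the 57 and 77 nucleotide figures given in Table 1.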

The above method permits one to sequence from one
side of a target molecule. The use of a vector containing
two recombinational sites permits one to sequence from
both sides of the molecule.

Table 1 compares the ability of the vectors and
methods of the present invention, with those of the
multiplex method, to facilitate multiplex sequence
analysis of ten sequencing reactions with a target DNA
molecule.

TABLE 1

                                   Oligonucleotide  Oligonucleotide  Multiplex
                                          I               II          Method

Number of Vectors Required                1               1             10

Number of Recombination
Oligonucleotides Required                10              10              0

Number of Primers Required               10               1         [illegible]

Approximate Number of
Nucleotides from 5' End
of Primer to Target                      57              77             43

Number of Probe
Oligonucleotides Required                10              10             10

Recombinase Required (Cre)              Yes             Yes             No

The present invention includes articles of
manufacture, such as "kits." In one embodiment, such
kits will, typically, be specially adapted to contain in
close compartmentalization a first container which
contains a DNA vector, which has at least one
recombinational site (I); a second container which
contains at least one probe/primer DNA molecule having a
recombinational site (II), and a probe/primer region; and
a recombinase capable of mediating recombination between
site (I) of the DNA vector and site (II) of the
probe/primer DNA molecule. The kit may additionally
contain multiple probe/primer DNA molecules, which may be
used to facilitate the multiplex sequence analysis of DNA
in accordance with the methods of the invention. The kit
may additionally contain instructional brochures, and the
like. It may also contain reagents sufficient to
accomplish DNA sequencing.

In a second embodiment, such kits will, typically, be
specially adapted to contain in close compartmentalization
a first container which contains a DNA molecule (such
as a linear oligonucleotide or a vector) which has at
least one recombinational site (I); a second container
which contains at least one DNA molecule having a
recombinational site (II), and is detectably labelled; and
a recombinase capable of mediating recombination between
the recombinational site (I) of the DNA molecule and site
(II) of the labelled oligonucleotide. The kit may
additionally contain instructional brochures, and the
like. It may also contain reagents sufficient to
accomplish DNA sequencing.

EXAMPLE 4
SEQUENCING OF A COSMID MOLECULE 

The present invention facilitates the sequencing of
cosmid molecules. In this method, a cosmid is constructed
so as to contain a loxP site (Figure 19). The molecule is
incubated in the presence of Cre and a loxP-containing
oligonucleotide. Preferably, the oligonucleotide is
single-stranded, and will possess a sequence which causes
it to snap back upon itself (Figure 20). As a result of
such incubation, a linear molecule will be produced having
the structure shown in Figure 21. Upon restriction
endonuclease digestion, an array of partial-digestion
products such as those shown in Figure 22 is obtained.

As will be recognized, the effect of the reaction has
been to produce a series of oligonucleotides which contain,
at most, only one loxP site. This mixture of
oligonucleotides is then incubated with a DNA ligase in
the presence of a second loxP-containing oligonucleotide,
which will preferably be single-stranded, and possess a
sequence which causes it to snap back upon itself, such as
shown in Figure 20. As a result of such incubation, three
general classes of molecules will be present in the
reaction:

(I) those with only one loxP site,
(II) those having two loxP sites in a direct repeat
(Figure 23), and
(III) those having two loxP sites in an inverted
repeat (Figure 24).

As will be perceived, only molecules which contain
the target sequences that were initially bound to the loxP
site of the first DNA molecule (i.e., A-B sequences) will
contain two directly repeated loxP sites (i.e., class II
molecules). Such molecules thus contain target DNA rather
than cosmid vector DNA.

The mixture of molecules will then preferably be
separated, as with agarose gel electrophoresis, or other
conventional means, and the different sizes of molecules
eluted or otherwise recovered.
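
The three classes of ligation products can be expressed as a small classification rule. This is only an illustrative sketch: it assumes each molecule is described by a list holding the orientation ("+" or "-") of every loxP site it carries, a representation that is not part of the disclosure.

```python
def classify_molecule(lox_orientations):
    """Assign a ligation product to class I, II or III.

    lox_orientations: list of "+" or "-" values, one per loxP site.
    Class I   : exactly one loxP site.
    Class II  : two loxP sites in the same orientation (direct repeat).
    Class III : two loxP sites in opposite orientations (inverted repeat).
    """
    if len(lox_orientations) == 1:
        return "I"
    if len(lox_orientations) == 2:
        first, second = lox_orientations
        return "II" if first == second else "III"
    raise ValueError("expected one or two loxP sites")


# Hypothetical products of the ligation step.
for sites in (["+"], ["+", "+"], ["+", "-"]):
    print(sites, classify_molecule(sites))
```

Only class II molecules, with directly repeated sites, are the ones the text identifies as carrying target DNA rather than cosmid vector DNA.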

The directly repeated loxP sites present on these
molecules permit one, in a Cre-mediated reaction, to
recombine the cloned DNA between these sites into any of
the loxP-containing vectors discussed above.

Thus, this method permits one to subclone target DNA
from a cosmid into a smaller vector. Significantly, the
cloned DNA is manipulated such that it becomes flanked
with directly repeating loxP sites. Moreover, the method
permits one to obtain and clone a set of nested
oligonucleotide fragments of a desired target molecule.

EXAMPLE 5
MULTIPLEX RESTRICTION FRAGMENT ANALYSIS

The use of Cre/loxP-mediated site-specific
recombination as a method to facilitate multiplex mapping
was demonstrated by the following procedure. The target
molecules were pLox, a 2.9 kb plasmid with a loxP site
cloned into a polylinker region, and pSPORT-lox, a 4.1 kb
plasmid with a loxP site inserted into its multiple
cloning site (MCS).

The sequences chosen for hybridization probes were
taken from Church et al. (Science 240:185-188 (1988)) and
the hybridization, washing, and probe stripping procedures
disclosed therein were used with minor modification.
Specifically, the hybridization probes used were (in the
Church et al. nomenclature) P01, P02, P03 and P04.

Recombinant molecules to be eventually used as
substrates for multiplex mapping were generated as
follows. A partially duplex molecule which was composed
of the oligonucleotides, as shown in Figure 25, was
incubated with pSPORT-lox in the presence of Cre under
conditions sufficient to permit recombination to occur.
Specifically, the reaction contained 1 pmol of plasmid,
4 pmol of oligonucleotides and 5 units of Cre (NiW) in
buffer composed of 50 mM Tris-HCl (pH 7.5), 33 mM NaCl, 5
mM spermidine, 0.5 mg/ml bovine serum albumin (BSA);
incubations were at 37°C for 15 minutes. A separate
reaction containing pLox and a second partially duplex
oligonucleotide of the structure shown in Figure 26 was
also incubated in the presence of Cre such that
recombination took place. After inactivation of the Cre
by heating to 65°C for 10 minutes, portions of these
recombination reactions were either kept separate or mixed
together and then subjected to partial digestion with the
restriction endonuclease HaeIII or HhaI. The products
were resolved on an agarose gel. After electrophoresis,
an overnight alkaline transfer to a charged nylon membrane
(BioDyne-B) was performed (Reed and Mann, Nucleic Acids
Res. 13:7207-7221 (1985)).

Pre-hybridizations and hybridizations were performed
in the buffers of Church et al. (Science 240:185-188
(1988)); however, incubations were carried out at 37°C
rather than 42°C and hybridizations were extended
overnight. Oligonucleotide probes (i.e., P01, P02, P03 and
P04 of Church et al., Science 240:185-188 (1988)) were
labeled with T4 polynucleotide kinase and [γ-32P]ATP. All
washes were performed at room temperature and consisted of
two washes with 6xSSC for 2 minutes each followed by
washing with 2xSSC + 0.1% sodium dodecyl sulfate (SDS),
2.5 minutes total with one change of buffer. (20xSSC = 3 M
NaCl, 0.3 M Na citrate, pH 7.0). Membranes were then
subjected to autoradiography to determine the linear map
of the respective restriction sites.
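
For readers preparing the wash buffers, the working concentrations follow directly from the 20xSSC definition given above. The sketch below is only a convenience calculation; the function name is hypothetical, and it assumes simple linear dilution of the 20x stock.

```python
# Composition of the 20x stock as defined in the text (molar).
STOCK_FOLD = 20
STOCK_MOLAR = {"NaCl": 3.0, "Na citrate": 0.3}


def ssc_molarity(fold):
    """Molar concentration of each solute in SSC at the given strength."""
    return {solute: conc * fold / STOCK_FOLD
            for solute, conc in STOCK_MOLAR.items()}


def stock_volume_ml(fold, final_volume_ml):
    """Volume of 20x stock needed for a given final volume."""
    return final_volume_ml * fold / STOCK_FOLD


# The two wash strengths used in this example.
for fold in (6, 2):
    print(f"{fold}xSSC:", ssc_molarity(fold),
          "stock per 100 ml:", stock_volume_ml(fold, 100), "ml")
```

On this basis the 6xSSC wash is about 0.9 M in NaCl and the 2xSSC wash about 0.3 M, so the second wash is the more stringent of the two.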

Probe was then stripped from the membrane by
incubation in 2 mM Na2EDTA + 0.1% SDS (adjusted to pH 8.3
with Tris base) at 65°C for 10 minutes. Removal of probe
was verified by autoradiography and the hybridization,
washing and visualization process was repeated with a
different radioactive probe.

This method was shown to be highly specific with no
background from cross-hybridization. It yielded accurate
fine structure maps of both substrates. Significantly,
even within lanes which contained mixtures of the two
targets, each pattern could be detected independently,
sequentially, and with complete specificity.

While the invention has been described in connection
with specific embodiments thereof, it will be understood
that it is capable of further modifications and this
application is intended to cover any variations, uses, or
adaptations of the invention following, in general, the
principles of the invention and including such departures
from the present disclosure as come within known or
customary practice within the art to which the invention
pertains and as may be applied to the essential features
hereinbefore set forth and as follows in the scope of the
appended claims.

1. A method for analyzing a target DNA molecule, which
comprises:

(A) forming a recombinant molecule, said recombinant
molecule comprising a probe/primer sequence linked to a
recombinational site (I), wherein said site is linked to
the sequence of said target molecule; and

(B) analyzing said target molecule using a nucleic acid
molecule capable of hybridizing to said probe/primer
sequence, or its complement.

2. The method of claim 1, wherein said analysis
comprises determining a nucleotide sequence of said target
DNA molecule, wherein, in step (B), said analysis
comprises determining the sequence of the target molecule
using a nucleic acid molecule capable of hybridizing to
said probe/primer sequence, or its complement.

3. The method of claim 2, wherein said recombinant
molecule is formed by:

(1) introducing said target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating said vector-DNA construct in the
presence of a recombinase, and a DNA molecule having said
recombinational site (I) and said probe/primer region;
wherein said incubation is under conditions sufficient to
permit said recombinase to mediate recombination between
said recombinational site (II) of said vector-target DNA
construct and said recombinational site (I) of said DNA
molecule; and

(3) permitting said recombinase to mediate
recombination between said recombinational sites, and to
thereby form said sequencing molecule.

4. The method of claim 3, wherein at least two vector-target
DNA constructs are formed, and wherein at least two
different DNA molecules each having a recombinational site
(I) and further having a different probe/primer region are
employed; and wherein said determining of the sequence of
the target molecule is through use of two probes, each
capable of hybridizing to only one of said probe/primer
regions, or its complement.

5. The method of claim 3, wherein said recombination
is site-specific recombination.

6. The method of claim 5, wherein in said site-specific
recombination, said recombinase is Cre, and at
least one of said recombinational sites (I) or (II) is a
loxP site.

7. The method of claim 6, wherein said recombinational
site (I) is a loxP site.

8. The method of claim 6, wherein said vector contains
one wild-type loxP site and one mutant loxP site.

9. The method of claim 1, wherein said analysis
comprises ordering restriction endonuclease recognition
sites in a target DNA molecule, and wherein, in step (B),
said analysis comprises

(i) incubating said recombinant molecule in the presence
of a restriction endonuclease under conditions sufficient
to permit said endonuclease to cleave DNA containing a
cleavage site recognized by said endonuclease; and

(ii) determining the order of any restriction sites in
said target molecule using a nucleic acid molecule capable
of hybridizing to said probe/primer sequence, or its
complement.

10. The method of claim 9, wherein said recombinant
molecule is formed by:

(1) introducing said target DNA molecule into at
least one vector, having a recombinational site (II), to
thereby form a vector-target DNA construct;

(2) incubating said vector-DNA construct in the
presence of a recombinase, and a DNA molecule having said
recombinational site (I) and said probe/primer region;
wherein said incubation is under conditions sufficient to
permit said recombinase to mediate recombination between
said recombinational site (I) of said DNA molecule, and
said recombinational site (II) of said vector-target DNA
construct; and

(3) permitting said recombinase to mediate
recombination between said recombinational sites (I) and
(II), and to thereby form said recombinant molecule.

11. The method of claim 10, wherein at least two
vector-target DNA constructs are formed, and wherein at
least two different DNA molecules each having a
recombinational site (I) and further having a different
probe/primer region are employed; and wherein said
ordering of restriction sites of the target molecule is
through use of two probes, each capable of hybridizing to
only one of said probe/primer regions, or its complement.

12. The method of claim 10, wherein said recombination
is site-specific recombination.

13. The method of claim 12, wherein in said site-specific
recombination, said recombinase is Cre, and at
least one of said recombinational sites (I) and (II) is a
loxP site.

14. The method of claim 13, wherein said
recombinational site (I) is a loxP site.

15. The method of claim 13, wherein said vector
contains one loxP site and one mutant loxP site.

16. A kit specially adapted to mediate recombination
between a DNA molecule having a recombinational site (I),
and a DNA vector, having a recombinational site (II), said
kit comprising in close compartmentalization:

1) a first container containing a recombinase
capable of mediating said recombination between
said site (I) of said DNA molecule and said
site (II) of said vector; and

2) a second container containing a DNA molecule
having said recombinational site (I).

17. The kit of claim 16, wherein said recombinase is
Cre, and wherein at least one of said recombinational
sites (I) and (II) is a loxP site.

18. The kit of claim 16, which additionally contains a 
third container containing said DNA vector. 

19. The kit of claim 18, wherein said vector contains 
one loxP site. 

20. The kit of claim 18, wherein said vector contains
one wild-type loxP site and one mutant loxP site.

21. A set of nested oligonucleotides each of which has
a first region of unknown sequence, and a second region of
known sequence, wherein said second region comprises both
a recombinational site, and a probe/primer region.

22. The set of oligonucleotides of claim 21, wherein at
least one of said oligonucleotides is hybridized to a
probe.

23. The set of oligonucleotides of claim 21, wherein at
least one of said oligonucleotides is hybridized to a
primer.

24. The set of oligonucleotides of claim 21, wherein
said recombinational site is a loxP site.